Mining HIV protease cleavage data using genetic programming with a sum-product function

Authors

  • Zheng Rong Yang
  • Andrew R. Dalby
  • Jing Qiu
Abstract

MOTIVATION: Designing effective HIV inhibitors requires understanding the mechanism behind HIV protease cleavage specificity. Various methods have been developed to explore this specificity, but extracting discriminant rules while maintaining high prediction accuracy remains challenging. An earlier study successfully used genetic programming with a min-max scoring function to extract discriminant rules. However, with that function the decision ultimately degenerates to a single residue, which makes further improvement of prediction accuracy difficult. The challenge of revising the min-max scoring function to improve prediction accuracy motivated this study.

RESULTS: This paper designs a new scoring function, called a sum-product function, for extracting HIV protease cleavage discriminant rules using genetic programming. The experiments show that the new scoring function is superior to the min-max scoring function.

AVAILABILITY: The software package can be obtained on request from Dr Zheng Rong Yang.
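
The contrast between the two scoring functions can be illustrated with a minimal sketch. The abstract does not define either function, so everything here is an assumption made for illustration: a rule is a small tree of hypothetical per-residue scores joined by "and"/"or", the min-max variant maps "and" to min and "or" to max, and the sum-product variant maps "and" to a product and "or" to a sum. The paper's actual definitions may differ.

```python
# Illustrative only: the rule encoding and both operator mappings are
# assumptions, not the definitions used in the paper.

def min_max(rule):
    """Score a rule tree with 'and' -> min and 'or' -> max."""
    if isinstance(rule, (int, float)):
        return float(rule)  # leaf: hypothetical per-residue score
    op, left, right = rule
    a, b = min_max(left), min_max(right)
    return min(a, b) if op == "and" else max(a, b)

def sum_product(rule):
    """Score a rule tree with 'and' -> product and 'or' -> sum."""
    if isinstance(rule, (int, float)):
        return float(rule)
    op, left, right = rule
    a, b = sum_product(left), sum_product(right)
    return a * b if op == "and" else a + b

# (r1 and r2) or (r3 and r4), with made-up residue scores.
rule = ("or", ("and", 0.9, 0.4), ("and", 0.7, 0.2))
# Same rule with only the first residue score changed.
perturbed = ("or", ("and", 0.8, 0.4), ("and", 0.7, 0.2))

print(min_max(rule), min_max(perturbed))
print(sum_product(rule), sum_product(perturbed))
```

Under these assumptions the min-max score always equals one of the leaf scores, so changing another residue's score can leave the result unchanged, which is one way to read the abstract's remark that the decision degenerates to a single residue. The sum-product score, by contrast, depends on every leaf.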

Similar resources

A Mathematical Programming Model and Genetic Algorithm for a Multi-Product Single Machine Scheduling Problem with Rework Processes

In this paper, a multi-product single machine scheduling problem with the possibility of producing defective jobs is considered. We consider rework in the scheduling environment and propose a mixed-integer programming (MIP) model for the problem. Based on the philosophy of just-in-time production, minimization of the sum of earliness and tardiness costs is taken into account as the objective fu...

Searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function.

This paper presents an algorithm which is able to extract discriminant rules from oligopeptides for protease proteolytic cleavage activity prediction. The algorithm is developed using genetic programming. Three important components in the algorithm are a min-max scoring function, the reverse Polish notation (RPN) and the use of minimum description length. The min-max scoring function is develop...

Mining association rules for HIV-1 protease cleavage site prediction

Several machine learning techniques, such as neural networks, nonlinear support vector machines and decision trees, have been used to model the specificity of HIV-1 protease and to extract specific patterns from peptides cleaved by this protease. Despite many studies, no perfect rules are yet known to determine the cleavage of a peptide by HIV-1 protease. These rules are useful for designing s...

Molecular detection of proteolytic activity of human parechovirus 2A protein by gene expression

Parechoviruses form one of the nine genera in the Picornaviridae family, and include two human pathogens: human parechovirus types 1 and 2 (Hpev1 and Hpev2). The genome of picornaviruses encodes a single polyprotein, which undergoes a cleavage cascade performed by virus-encoded proteases to give the final virus proteins. The primary cleavage is performed by the 2A protein and this step is critical for vi...

A Fast and Self-Repairing Genetic Programming Designer for Logic Circuits

The important parameters in the design and implementation of combinational logic circuits are usually the number of gates, the number of transistors, and the number of levels used in the circuit. In this regard, various evolutionary paradigms with differing competency have recently been introduced. However, while advantageous, evolutionary paradigms also have some limitations including: a) lack of con...

Journal:
  • Bioinformatics

Volume 20, Issue 18

Pages: -

Publication date: 2004